Broad Coverage Automatic Morphological Segmentation of German Words

Authors

  • Thomas Pachunke
  • Oliver Mertineit
  • Klaus Wothke
  • Rudolf Schmidt
Abstract

A system for the automatic segmentation of German words into morphs was developed. The system draws on two main sources of linguistic knowledge: a word syntax and a morph dictionary. The syntax is written in the formalism of right-linear regular grammars and comprises approximately 1,400 rules describing the sequences of morph classes that underlie syntactically well-formed words. The morph dictionary contains almost 11,000 morphs, each assigned to up to 6 morph classes. Statistical evaluations with 6,000 test words showed that more than 99% of the segmented words received a correct segmentation.
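
The interplay of the two knowledge sources can be illustrated with a minimal sketch. This is not the authors' implementation: the six-entry dictionary, the handful of rules, and the names `MORPHS`, `RULES`, and `segment` are all hypothetical stand-ins for the roughly 11,000 morphs and 1,400 rules mentioned above. The sketch only assumes that a right-linear rule has the form A -> class B or A -> class, so the grammar can be walked left to right like a finite-state automaton while the dictionary proposes candidate morphs.

```python
# Toy morph dictionary (hypothetical entries): morph -> set of morph classes.
# In the described system a morph may belong to up to 6 classes.
MORPHS = {
    "un": {"PREFIX"},
    "freund": {"STEM"},
    "haus": {"STEM"},
    "lich": {"SUFFIX"},
    "keit": {"SUFFIX"},
    "en": {"INFL"},
}

# Toy right-linear word syntax (hypothetical rules).
# Each entry (cls, nxt) under nonterminal A encodes the rule A -> cls nxt.
# The sentinel "END" encodes a terminating rule A -> cls.
RULES = {
    "WORD": [("PREFIX", "STEMS"), ("STEM", "AFTER_STEM"), ("STEM", "END")],
    "STEMS": [("STEM", "AFTER_STEM"), ("STEM", "END")],
    "AFTER_STEM": [
        ("STEM", "AFTER_STEM"), ("STEM", "END"),
        ("SUFFIX", "AFTER_STEM"), ("SUFFIX", "END"),
        ("INFL", "END"),
    ],
}


def segment(word):
    """Return every segmentation of `word` licensed by MORPHS and RULES."""
    word = word.lower()
    results = []

    def walk(pos, state, morphs):
        # Try every dictionary morph that starts at `pos`.
        for end in range(pos + 1, len(word) + 1):
            morph = word[pos:end]
            for cls in MORPHS.get(morph, ()):
                for rule_cls, nxt in RULES.get(state, ()):
                    if rule_cls != cls:
                        continue
                    if nxt == "END":
                        # A terminating rule is only valid at the word end.
                        if end == len(word):
                            results.append(morphs + [morph])
                    else:
                        walk(end, nxt, morphs + [morph])

    walk(0, "WORD", [])
    return results


if __name__ == "__main__":
    for w in ["Unfreundlichkeit", "Hausfreund", "Freunden", "xyz"]:
        print(w, segment(w))
```

A word with no licensed segmentation yields an empty list, and an ambiguous word would yield several candidates. How the original system ranks or filters competing segmentations is not stated in the abstract, so the sketch simply returns all of them.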

Similar resources

Regional Pronunciation Variants for Automatic Segmentation

The goal of this paper is to create an extended rule corpus with approximately 2300 phonetic rules which model segmental variation of regional variants of German. The phonetic rules express at a broad-phonetic level phenomena of phonetic reduction in German that occurs within words and across word boundaries. In order to get an improvement in automatic segmentation of regional speech variants, ...

Morphologically Based Automatic Phonetic Transcription

A system is described that automatically generates phonetic transcriptions for German orthographic words. The entire generative process consists of two main steps. In the first step, the system segments the words into their morphs, or prefixes, stems, and suffixes. This segmentation is very important for the transcription of German words, because the pronunciation of the letters depends also on...

Applying speech verification to a large data base of German to obtain a statistical survey about rules of pronunciation

In this paper we present a new research project to obtain a statistical survey of the pronunciation of German using an automatic system for segmentation and labeling of speech data and a very large data base of spoken German (GermAn Spoken in Public, GASP). It mainly involves the development of two components: a) An automatic system of speech verification (PHONSEG) which produces a segmentation...

Reducing Light Change Effects in Automatic Road Detection

Automatic road extraction from aerial images can be very helpful in traffic control and vehicle guidance systems. Most of the road detection approaches are based on image segmentation algorithms. Color-based segmentation is very sensitive to light changes and consequently the change of weather condition affects the recognition rate of road detection systems. In order to reduce the light change ...

Publication date: 1992